104 research outputs found

    Critical assessment of methods of protein structure prediction: Progress and new directions in round XI

    Modeling of protein structure from amino acid sequence now plays a major role in structural biology. Here we report new developments and progress from the CASP11 community experiment, assessing the state of the art in structure modeling. Notable points include the following: (1) New methods for predicting three-dimensional contacts resulted in a few spectacular template-free models in this CASP, whereas models based on sequence homology to proteins with experimental structure continue to be the most accurate. (2) Refinement of initial protein models, primarily using molecular dynamics related approaches, has now advanced to the point where the best methods can consistently (though slightly) improve nearly all models. (3) The use of relatively sparse NMR constraints dramatically improves the accuracy of models, and another type of sparse data, chemical crosslinking, introduced in this CASP, also shows promise for producing better models. (4) A new emphasis on modeling protein complexes, in collaboration with CAPRI, has produced interesting results, but also shows the need for more focus on this area. (5) Methods for estimating the accuracy of models have advanced to the point where they are of considerable practical use. (6) A first assessment demonstrates that models can sometimes successfully address biological questions that motivate experimental structure determination. (7) There is continuing progress in accuracy of modeling regions of structure not directly available by comparative modeling, while there is marginal or no progress in some other areas.

    New encouraging developments in contact prediction: Assessment of the CASP11 results

    This article provides a report on the state of the art in the prediction of intra-molecular residue-residue contacts in proteins based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of 27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful in modeling three-dimensional structures; in particular, target T0806 was modeled exceedingly well, with accuracy not yet seen for ab initio targets of this size (>250 residues).

    Assessment of protein disorder region predictions in CASP10

    A systematic analysis of intrinsic disorder in proteins started at the turn of the century [1–4] and still remains a hot research topic [5]. Only this year several papers covering general aspects of protein disorder have been published [5–9] and the discussion on the fundamental principles of disorder continues to unfold [10,11]. A PubMed search with the keywords “intrinsically disordered protein 2012” and “intrinsically disordered protein 2013” returned 525 and 305 entries, respectively (as of April 2013). The number of experimentally verified intrinsically disordered proteins and regions is steadily increasing. The DisProt database [12] currently contains annotations for 684 intrinsically disordered proteins, 1513 disordered regions, and describes 38 different biological functions associated with disordered regions. The more recently established IDEAL database also has a number of useful annotations on disordered proteins [13]. Such high interest in this area of research triggered rapid development of computational methods for prediction of the location of disordered regions in proteins. The recently published reviews and assessment papers [14–18] altogether provide a comprehensive analysis of more than fifty disorder prediction methods. An independent assessment of the protein disorder methods within the scope of CASP started in 2002 and is now already in its sixth round [18–22]. This study analyzes the results obtained by the 28 disorder prediction groups participating in CASP10.

    Evaluation of template-based models in CASP8 with standard measures

    The strategy for evaluating template-based models submitted to CASP evolved continuously from CASP1 to CASP5, leading to a standard procedure that has been used in all subsequent editions. The established approach includes methods for calculating the quality of each individual model, for assigning scores based on the distribution of the results for each target, and for computing the statistical significance of the differences in scores between prediction methods. These data are made available to the assessor of the template-based modeling category, who uses them as a starting point for further evaluations and analyses. This article describes the detailed workflow of the procedure, provides justifications for a number of choices that are customarily made for CASP data evaluation, and reports the results of the analysis of template-based predictions at CASP8.

    A Comprehensive Analysis of the Structure-Function Relationship in Proteins Based on Local Structure Similarity

    BACKGROUND: Sequence similarity to characterized proteins provides testable functional hypotheses for less than 50% of the proteins identified by genome sequencing projects. With structural genomics it is believed that structural similarities may give functional hypotheses for many of the remaining proteins. METHODOLOGY/PRINCIPAL FINDINGS: We provide a systematic analysis of the structure-function relationship in proteins using the novel concept of local descriptors of protein structure. A local descriptor is a small substructure of a protein which includes both short- and long-range interactions. We employ a library of commonly reoccurring local descriptors general enough to assemble most existing protein structures. We then model the relationship between these local shapes and Gene Ontology using rule-based learning. Our IF-THEN rule model offers legible, high resolution descriptions that combine local substructures and is able to discriminate functions even for functionally versatile folds such as the frequently occurring TIM barrel and Rossmann fold. By evaluating the predictive performance of the model, we provide a comprehensive quantification of the structure-function relationship based only on local structure similarity. Our findings are, among others, that conserved structure is a stronger prerequisite for enzymatic activity than for binding specificity, and that structure-based predictions complement sequence-based predictions. The model is capable of generating correct hypotheses, as confirmed by a literature study, even when no significant sequence similarity to characterized proteins exists. CONCLUSIONS/SIGNIFICANCE: Our approach offers a new and complete description and quantification of the structure-function relationship in proteins. By demonstrating how our predictions offer higher sensitivity than using global structure, and complement the use of sequence, we show that the presented ideas could advance the development of meta-servers in function prediction.

    Assessment of chemical-crosslink-assisted protein structure modeling in CASP13

    With the advance of experimental procedures, obtaining chemical crosslinking information is becoming a fast and routine practice. Information on crosslinks can greatly enhance the accuracy of protein structure modeling. Here, we review the current state of the art in modeling protein structures with the assistance of experimentally determined chemical crosslinks within the framework of the 13th meeting of Critical Assessment of Structure Prediction approaches. This largest-to-date blind assessment reveals benefits of using data assistance in difficult-to-model protein structure prediction cases. However, in a broader context, it also suggests that with the unprecedented advance in the accuracy of contact prediction in recent years, experimental crosslinks will be useful only if their specificity and accuracy are further improved and they are better integrated into computational workflows.

    Target highlights in CASP9: Experimental target structures for the critical assessment of techniques for protein structure prediction

    One goal of the CASP community-wide experiment on the critical assessment of techniques for protein structure prediction is to identify the current state of the art in protein structure prediction and modeling. A fundamental principle of CASP is blind prediction on a set of relevant protein targets, that is, the participating computational methods are tested on a common set of experimental target proteins, for which the experimental structures are not known at the time of modeling. Therefore, the CASP experiment would not have been possible without broad support of the experimental protein structural biology community. In this article, several experimental groups discuss the structures of the proteins which they provided as prediction targets for CASP9, highlighting structural and functional peculiarities of these structures: the long tail fiber protein gp37 from bacteriophage T4, the cyclic GMP-dependent protein kinase Iβ dimerization/docking domain, the ectodomain of the JTB (jumping translocation breakpoint) transmembrane receptor, Autotaxin in complex with an inhibitor, the DNA-binding J-binding protein 1 domain essential for biosynthesis and maintenance of DNA base-J (β-D-glucosyl-hydroxymethyluracil) in Trypanosoma and Leishmania, a so far uncharacterized 73 residue domain from Ruminococcus gnavus with a fold typical for PDZ-like domains, a domain from the phycobilisome core-membrane linker phycobiliprotein ApcE from Synechocystis, the heat shock protein 90 activators PFC0360w and PFC0270w from Plasmodium falciparum, and 2-oxo-3-deoxygalactonate kinase from Klebsiella pneumoniae. 
© 2011 Wiley-Liss, Inc. Grant sponsor: Spanish Ministry of Education and Science; Grant number: BFU2008-01588; Grant sponsor: European Commission; Grant number: NMP4-CT-2006-033256; Grant sponsor: Spanish Ministry of Education and Science (José Castillejo fellowship); Grant sponsor: Xunta de Galicia (Angeles Alvariño fellowship); Grant sponsor: National Institutes of Health; Grant numbers: K22-CA124517 (D.E.C.); R01-GM090161 (C.K.) GM074942; GM094585; Grant sponsor: U. S. Department of Energy, Office of Biological and Environmental Research; Grant number: DE-AC02-06CH11357 (to A.J.); Grant sponsor: Foundation for Polish Science (to K.M.); Grant sponsor: NSF; Grant number: DBI 0829586